\chapter{Introduction}

\section{What is PyTOUGH?}
\index{PyTOUGH}

PyTOUGH (\textbf{Py}thon \textbf{TOUGH}) is a set of Python software routines for making it easier to use the TOUGH2 geothermal reservoir simulator. Using PyTOUGH, it is possible to automate the creation and editing of TOUGH2 model grids and data files, and the analysis and display of model simulation results.

\section{What are TOUGH2 and AUTOUGH2?}
\index{TOUGH2}
\index{TOUGH2!AUTOUGH2}

TOUGH2 \citep{tough2} is a general-purpose simulator for modelling subsurface fluid and heat flow, often used for simulating geothermal reservoirs.

AUTOUGH2 is the University of Auckland version of TOUGH2.  The main differences between AUTOUGH2 and TOUGH2 are:

\begin{itemize}
  \item \textbf{EOS handling}: AUTOUGH2 includes all the different equations of state (EOSes) in a single executable program, whereas TOUGH2 uses a separate executable for each EOS.  As a result, the main input data file for an AUTOUGH2 simulation also includes extra data blocks to specify which EOS is to be used.
  \item \textbf{Generator types}: AUTOUGH2 includes a variety of extra generator types developed for geothermal reservoir simulation (e.g. makeup and reinjection wells).
\end{itemize}

\index{TOUGH2!TOUGH2-MP}
\index{TOUGH2!TOUGH+}
TOUGH2\_MP \citep{tough2mp} is a multi-processor version of TOUGH2.  TOUGH+ is a redeveloped version of TOUGH2, with a more modular code structure implemented in Fortran-95.

\subsection{TOUGH2 data files}
\index{TOUGH2 data files}
\index{TOUGH2}
\index{TOUGH2!AUTOUGH2}

TOUGH2 takes its main input from a \textbf{data file}, which contains information about the model grid, simulation parameters, time stepping, sources of heat and mass etc.  The data file formats for TOUGH2 and AUTOUGH2 are almost identical, with minor differences.  TOUGH2\_MP can read TOUGH2 data files, but also supports some extensions (e.g. for 8-character instead of 5-character block names) to this format.  PyTOUGH does not currently support the TOUGH2\_MP extensions.  TOUGH+ data files can also have some extensions, which PyTOUGH does not support as yet.

Because TOUGH2 uses a finite volume formulation, the only model grid data it needs are the volumes of the grid blocks and the distances and areas associated with the connections between blocks.  Hence, the TOUGH2 data file need not contain any information about the specific locations of the blocks in space, and it contains no information about the locations of the vertices or edges of the blocks.  This makes it easy to use TOUGH2 to simulate one-, two- or three-dimensional models, all with the same format of data file.  However, this lack of reference to any coordinate system also makes it more difficult to generate model grids, and to visualise simulation results in space.
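As a rough illustration of how little geometric information a finite volume grid needs, the following sketch describes a one-dimensional column of three blocks using only block volumes and connection data.  The names (\texttt{volumes}, \texttt{connections}, \texttt{neighbours}) and all values are hypothetical, and this is not the TOUGH2 data file format, just a minimal picture of the idea.

```python
# Hypothetical finite volume grid: three 10 m cubes stacked in a column.
# Only volumes and connection data are stored; there are no coordinates.

# Block volumes (m^3), keyed by block name
volumes = {'A': 1000.0, 'B': 1000.0, 'C': 1000.0}

# Connections: (block 1, block 2, distance 1, distance 2, area),
# where the distances (m) run from each block centre to the shared face
# and the area (m^2) is that of the shared face.
connections = [
    ('A', 'B', 5.0, 5.0, 100.0),
    ('B', 'C', 5.0, 5.0, 100.0),
]

def neighbours(name):
    """Return the sorted names of the blocks connected to the given block."""
    found = []
    for block1, block2, dist1, dist2, area in connections:
        if block1 == name:
            found.append(block2)
        elif block2 == name:
            found.append(block1)
    return sorted(found)

total_volume = sum(volumes.values())
print(neighbours('B'), total_volume)
```

Nothing in this description says whether the column is vertical or horizontal, which is exactly why a separate geometry description is needed for plotting.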

\subsection{MULgraph geometry files}
\index{MULgraph geometry!files}

Because the TOUGH2 data file contains no information about the locations of the blocks in space, a separate \textbf{geometry file} can be used to create grids for TOUGH2 simulations and to visualise simulation results.  The geometry file contains information about the locations of the grid block vertices.  It can be used to visualise results using the \textbf{MULgraph} graphical post-processor for TOUGH2 and AUTOUGH2 \citep{mulgraph}, developed at the University of Auckland in the 1990s.

The MULgraph geometry file assumes the grid has a layered structure, with blocks arranged in layers and columns, and the same arrangement of columns on each layer.  (At the top of the model grid, blocks in some columns may be missing, to allow the grid to follow the surface topography.)
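The layered structure can be pictured with a small sketch.  Everything here is hypothetical (the layer and column names, the elevations, and the rule used to decide whether a block exists); it only illustrates the idea of identical columns on each layer with some blocks missing near the surface, and does not reproduce the MULgraph file format.

```python
# Hypothetical layered grid: every layer uses the same set of columns,
# but a column only has blocks in layers lying below its surface elevation.

# Layers from top to bottom: (name, top elevation, bottom elevation) in metres
layers = [('L1', 100.0, 50.0), ('L2', 50.0, 0.0), ('L3', 0.0, -50.0)]

# Surface elevation (m) of each column
column_surface = {'c1': 100.0, 'c2': 40.0}

def layers_in_column(col):
    """Return the names of the layers that contain a block in this column."""
    surface = column_surface[col]
    # Assumption: a block exists wherever the layer bottom is below the surface.
    return [name for name, top, bottom in layers if bottom < surface]

# Each block is identified by a (layer, column) pair
blocks = [(lay, col) for col in sorted(column_surface)
          for lay in layers_in_column(col)]
print(blocks)
```

In this sketch column \texttt{c2} has no block in the top layer, because its surface lies below the bottom of that layer.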

If you do not have a MULgraph geometry file for your model, it is easy to create one for a \hyperref[sec:mulgrid:rectangular]{rectangular} grid. In fact, PyTOUGH is able to \hyperref[sec:t2grid:rectgeo]{reverse-engineer} a MULgraph geometry from a TOUGH2 data file containing a rectangular grid.

A specification of the MULgraph geometry file format can be found in Appendix \ref{geometry_file_format}.

\subsection{TOUGH2 listing files}
\index{TOUGH2 listing files}

The output from TOUGH2 is written to a \textbf{listing file}, which is a text file containing tables of results for each time step (or only selected time steps, if preferred).  At each time step there is an `element table', containing results for block properties (e.g. pressure and temperature).  There may also be a `connection table', containing results for flows between blocks, and a `generation table', containing results (e.g. flow rates) at the generators in the model (e.g. wells).

The formats of the listing files produced by TOUGH2, AUTOUGH2, TOUGH2\_MP and TOUGH+ are all slightly different, and also vary depending on the EOS used.  However, PyTOUGH attempts to detect and read all of these formats.

\section{What is Python?}
\index{Python}
\index{Python!3.x}

Python is a general-purpose programming language.  It is free and open-source, and runs on many different computer operating systems (Linux, Windows, Mac OS X and others).  Python can be downloaded from the Python website (\url{http://www.python.org}), which also contains detailed reference material about the Python language.  If you are using Linux you probably already have Python, as it is included in most Linux distributions.

PyTOUGH should run on any version of Python 2.x newer than 2.4 (though version 2.6 or newer is recommended). PyTOUGH version 1.5 or later should also run on Python 3.x.

If you are unfamiliar with Python (even if you have used another programming language before), it is highly recommended that you do one of the many Python tutorials available online, e.g.

\begin{itemize}
  \item \url{http://docs.python.org/tutorial/}
  \item \url{http://wiki.python.org/moin/BeginnersGuide}
\end{itemize}

\subsection{Python basics}

\subsubsection{Objects}
\index{Python!objects}

Python is what is known as an \textbf{object-oriented} language, which means that it is possible to create special customised data types, or `classes', to encapsulate all the properties and behaviour of the things (objects) we are dealing with in a program.  This is a very useful way of simplifying complex programs.  (In fact, in Python, everything is treated as an object, even simple things like integers and strings.)

For example, in a TOUGH2 model grid we have collections of grid blocks, and we need to store the names of these blocks and their volumes and rock types.  In a non-object-oriented language, these could be stored in three separate arrays: a string array for the names, a real (or `float') array for the volumes and another string array for the rock types.  In an object-oriented language like Python, we can define a new data type (or `class') for blocks, which holds the name, volume and rock type of the block.  If we declare an object called \texttt{blk} of this block class, we can access or edit its volume by referring to \texttt{blk.volume}.  In this way, we can store our blocks in one single array of block objects.  When we add or delete blocks from our grid, we can just add or delete block objects from the array, rather than having to keep track of three separate arrays.
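The block example above might be sketched as follows.  The \texttt{Block} class and the values used are hypothetical and are only meant to illustrate the idea; this is not the class PyTOUGH itself uses.

```python
class Block(object):
    """Hypothetical grid block holding a name, a volume and a rock type."""

    def __init__(self, name, volume, rocktype):
        self.name = name
        self.volume = volume      # m^3
        self.rocktype = rocktype

# One list of block objects replaces three separate arrays
blocks = [Block('A', 1000.0, 'sand'), Block('B', 2000.0, 'clay')]

blk = blocks[0]
blk.volume = 1500.0                       # edit a property of one block

blocks.append(Block('C', 500.0, 'sand'))  # add a block
del blocks[1]                             # delete block 'B'

names = [b.name for b in blocks]
print(names)
```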

In general, an object not only has \textbf{properties} (like \texttt{blk.volume}) but also \textbf{methods}, which are functions the object can carry out.  For example, if we wanted to rotate a MULgraph geometry file by $30\degree$, we could do this in PyTOUGH by declaring a MULgraph geometry file object called \texttt{geo}, and calling its \texttt{rotate} method: \texttt{geo.rotate(30)}.  The methods of an object are accessed in the same way that its properties are accessed: by adding a dot (.) after the object's name and then adding the name of the property or method.  Any arguments of the method (e.g. the angle in the \texttt{rotate} method above) are added in parentheses afterwards.
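To show how a method is defined and called, here is a minimal sketch using a hypothetical \texttt{Point} class with its own \texttt{rotate} method.  It is not PyTOUGH's implementation of \texttt{geo.rotate}, and the anticlockwise rotation direction is an assumption made for this example only.

```python
import math

class Point(object):
    """Hypothetical 2-D point with properties x, y and a rotate method."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def rotate(self, angle):
        """Rotate the point about the origin by angle (degrees, anticlockwise)."""
        a = math.radians(angle)
        x, y = self.x, self.y
        self.x = x * math.cos(a) - y * math.sin(a)
        self.y = x * math.sin(a) + y * math.cos(a)

p = Point(1.0, 0.0)
p.rotate(90)         # call the method, with its argument in parentheses
print(p.x, p.y)      # access the properties
```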

\subsubsection{Lists, dictionaries, tuples and sets}
\index{Python!lists}
\index{Python!dictionaries}
\index{Python!tuples}
\index{Python!sets}

Most programming languages have simple data types built in, e.g. float, double precision or integer numbers, strings, and arrays of these.  Python has some other data types which are very useful and are used a lot.

The first of these is the \textbf{list}.  A list can contain any ordered collection of objects, of any type, or even of different types, and is delimited by square brackets.  So for example we can declare a list \texttt{things = [1, 'two', 3.0]} containing an integer, a string and a float.  We can access the list's elements in much the same way as we access the elements of an array, for example \texttt{things[1]} would return the value \texttt{'two'} (note that in Python, as in most other languages besides Fortran, the indices of arrays and lists start at 0, not 1).  Additional elements can be added to a list at any time, without having to re-declare the size of the list: for example, \texttt{things.append('IV')} would add an extra element to the end of the list, giving it the value \texttt{[1, 'two', 3.0, 'IV']}.  It is also possible to remove elements from a list, e.g. \texttt{things.remove(3.0)}, which would give our list the value \texttt{[1, 'two', 'IV']}.
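The list operations described above can be tried out directly.  The last two lines (\texttt{len} and a negative index) are small additions not mentioned in the text.

```python
things = [1, 'two', 3.0]

second = things[1]      # 'two', because indices start at 0

things.append('IV')     # things is now [1, 'two', 3.0, 'IV']
things.remove(3.0)      # things is now [1, 'two', 'IV']

count = len(things)     # number of elements in the list
last = things[-1]       # negative indices count back from the end
print(things)
```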

Another useful Python data type is the \textbf{dictionary}.  Dictionaries are mainly used to store collections of objects (again, of any type or of different types) that are referenced by name rather than by index (as in an array or list).  A dictionary is delimited by curly brackets.  So for example we can declare a dictionary \texttt{phone = \{'Eric':8155, 'Fred':2350, 'Wilma':4667\}} and then find Fred's phone number from \texttt{phone['Fred']}, which would return \texttt{2350}.  For TOUGH2 models, blocks, generators, rock types and other objects are often referred to by name rather than index, so dictionaries are an appropriate way to store them.
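The phone book example can be extended slightly to show some other common dictionary operations.  The entry for \texttt{'Barney'} is made up for this sketch.

```python
phone = {'Eric': 8155, 'Fred': 2350, 'Wilma': 4667}

fred = phone['Fred']            # look up a value by name

phone['Barney'] = 1234          # add a new (hypothetical) entry
has_wilma = 'Wilma' in phone    # test whether a name is present
missing = phone.get('Dino')     # get() returns None instead of raising KeyError

names = sorted(phone.keys())    # the names, in alphabetical order
print(names)
```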

A third Python data type, similar to a list, is the \textbf{tuple}.  A tuple is essentially a list that cannot be changed, and is often used just for grouping objects together.  A tuple is delimited by parentheses.  For example, \texttt{things = (1, 'two', 3.0)} declares a tuple with three elements.  We can still refer to the elements of a tuple using e.g. \texttt{things[1]}, but we cannot assign new values to these elements or add or remove elements from the tuple once it has been declared.
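A short sketch of this behaviour: reading an element works as for a list, while an attempt to assign to an element raises a \texttt{TypeError}.  The final line shows tuple unpacking, a common use of tuples for grouping values.

```python
things = (1, 'two', 3.0)

second = things[1]          # reading an element works as for a list

try:
    things[1] = 'deux'      # assigning to an element is not allowed
    changed = True
except TypeError:
    changed = False

# Unpack the grouped values into separate variables
number, word, value = things
print(second, changed)
```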

Python also has a \textbf{set} data type, which represents a mathematical set, i.e. an unordered collection of objects.  One of the useful aspects of sets is that they cannot contain duplicate items.  As a result, for example, duplicate items can be removed from a list \texttt{x} simply by converting it to a set, and then back to a list: \texttt{x = list(set(x))}.  (Because sets are unordered, the original order of the items in the list is not necessarily preserved.)
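The following sketch removes duplicates from a list and also shows the intersection and union of two sets.  Since a set has no defined order, \texttt{sorted} is used where a predictable order is wanted.

```python
x = [3, 1, 3, 2, 1]

unique = list(set(x))       # duplicates removed; order is not guaranteed
ordered = sorted(set(x))    # duplicates removed, in ascending order

a = set([1, 2, 3])
b = set([3, 4])
common = a & b              # intersection: items in both sets
either = a | b              # union: items in at least one set
print(ordered, sorted(either))
```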

\subsection{How to run Python}
\index{Python!running}

Python can be run either interactively or via scripts.

\subsubsection{Running Python interactively}
\label{python_interactive}

The simplest way to run Python interactively is just by typing \texttt{python} at the command line.  (On Windows the directory that Python was installed into may have to be added to your \texttt{PATH} environment variable first.) The command line then becomes an interactive Python environment in which you can type Python commands at the Python command prompt \texttt{>>>}, e.g. in Windows:

\begin{verbatim}
C:\>python
Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> things = [1, 'two', 3.0]
>>> print(things[1])
two
>>> exit()

C:\>
\end{verbatim}

In the interactive Python environment, you can view help on the properties and methods of any Python object by typing \texttt{help(\emph{objectname})}, where \texttt{\emph{objectname}} is the name of an object that has been declared.  This will list the properties and methods of the object and a description of each one.

You can exit the interactive Python environment by typing \texttt{exit()}, or by pressing \texttt{Ctrl-Z} followed by \texttt{Enter} on Windows, or \texttt{Ctrl-D} on Linux.

\subsubsection{Python scripts}
\index{Python!scripts}

The real power of Python, however, lies in using it to write \textbf{scripts} to automate repetitive or complex tasks.  You can just type Python commands into a text file, save it with the file extension \texttt{.py}, and execute it by typing \texttt{python \emph{filename.py}}, where \texttt{\emph{filename.py}} is the name of the file.  (Once again, on Windows the directory that Python was installed into may have to be added to your \texttt{PATH} environment variable first.)

You can also debug a Python script using the `pdb' command-line debugger.  Typing \texttt{python -m pdb \emph{filename.py}} will start debugging the script \emph{filename.py}.

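It is also possible to run a Python script from within the interactive Python environment.  From the Python environment command line, typing \texttt{execfile(\emph{'filename.py'})} will execute the script \emph{filename.py}.  (The \texttt{execfile} function is only available in Python 2.x; in Python 3.x, \texttt{exec(open(\emph{'filename.py'}).read())} has the same effect.)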
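Since \texttt{execfile} does not exist in Python 3.x, one way to run a script from Python that works on both Python 2.7 and Python 3.x is the standard library function \texttt{runpy.run\_path}.  The sketch below writes a small throwaway script to a temporary file and then runs it; the script contents are purely illustrative.

```python
import os
import runpy
import tempfile

# Write a small throwaway script (hypothetical contents) to a temporary file
script = "things = [1, 'two', 3.0]\nsecond = things[1]\n"
handle, filename = tempfile.mkstemp(suffix='.py')
with os.fdopen(handle, 'w') as f:
    f.write(script)

# Run the script; the variables it defines are returned in a dictionary
result = runpy.run_path(filename)
os.remove(filename)

second = result['second']
print(second)
```

Unlike \texttt{execfile}, \texttt{runpy.run\_path} does not add the script's variables to the current environment; they are returned in a dictionary instead.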

\subsection{Python libraries}
\label{pylibraries}
\index{Python!libraries}

Python comes with a large number of features already built in, but for specialised tasks, additional \textbf{libraries} of Python software can be imported into Python as you need them.  PyTOUGH itself is a set of such libraries, and it in turn makes use of some other third-party Python libraries.

\subsubsection{Numerical Python}
\index{Python!Numerical Python (numpy)}

The most important of these is Numerical Python (`numpy'), which you will need to have installed on your computer before you can use PyTOUGH at all\footnote{PyTOUGH will run using Numeric, a now-obsolete predecessor of Numerical Python, though the PyTOUGH plotting functions will not work.  In general it is recommended to use Numerical Python if possible.}.  Numerical Python adds a special \texttt{numpy.array} class for fast multi-dimensional arrays, which PyTOUGH makes heavy use of, and a whole range of other features, e.g. linear algebra routines, Fourier transforms and statistics.
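As a brief taste of what \texttt{numpy.array} offers, the sketch below stores some made-up block temperatures in a two-dimensional array and carries out a few whole-array operations.  The values and variable names are hypothetical.

```python
import numpy as np

# Hypothetical block temperatures (deg C): 2 layers (rows) x 3 columns
temperature = np.array([[20.0, 25.0, 30.0],
                        [80.0, 90.0, 100.0]])

shape = temperature.shape               # dimensions of the array
layer_mean = temperature.mean(axis=1)   # mean temperature in each layer
kelvin = temperature + 273.15           # arithmetic acts on every element
hottest = temperature.max()             # largest value in the whole array
print(shape, layer_mean, hottest)
```

Operations like these act on the whole array at once, which is generally much faster than looping over the elements in Python.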

\subsubsection{Other libraries}

\index{Python!matplotlib}
\index{Python!scipy}
\index{Visualization Tool Kit (VTK)}

Some parts of PyTOUGH use other Python libraries.  You do not need these libraries unless you are using the parts of PyTOUGH that depend on them.

\begin{itemize}
\item \textbf{Scientific Python} (\url{http://www.scipy.org/}), a library of advanced mathematical functions (e.g. interpolation, calculus, optimisation)
\item \textbf{matplotlib} (\url{http://matplotlib.sourceforge.net/}), a library of graphical plotting routines
\item \textbf{VTK}, a Python interface to the Visualization Tool Kit (\url{http://www.vtk.org/}), a library for 3D visualisation of data via VTK itself, or software such as ParaView, Mayavi etc.
\end{itemize}

\subsection{Installing third-party Python libraries}

\subsubsection{Linux}

On Linux you can install third-party Python libraries via your package manager, e.g. on Debian or Debian-based distributions like Ubuntu:

\texttt{apt-get install python-numpy python-scipy python-matplotlib python-vtk}

\subsubsection{MS Windows}

On MS Windows systems, the easiest way to install these libraries is by using the \textbf{pip} tool. This is a Python package management tool which allows the user to install Python packages and also manages dependencies (when one package depends on other packages).

If you have Python version 2.7.9 or a more recent one then you should have pip already available on your system. If not, the best thing is probably to upgrade to a newer version of Python. If that is not possible for some reason, you can install pip separately by following the instructions at \url{https://pip.pypa.io/en/stable/installing/}.

It is recommended to download libraries to use with PyTOUGH (as *.whl files) from Christoph Gohlke's Python package site:

\url{http://www.lfd.uci.edu/\textasciitilde gohlke/pythonlibs/}

Here you can find Windows Python packages for numpy, scipy, matplotlib and VTK (and lots of others as well). Be sure to choose the version of each package appropriate for your version of Python. For example, if you are using 64-bit Python 2.7, then choose *.whl files with names including ``cp27'', and ``amd64''.

Once you have downloaded them, you can install them by opening a command prompt in the download directory and typing \texttt{pip install} followed by the exact name of the *.whl file.
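If you are not sure which files match your Python installation, the following sketch reports the Python version tag and whether Python is a 32-bit or 64-bit build.  It assumes the standard CPython interpreter (hence the \texttt{cp} prefix), and the variable names are chosen for this example only.

```python
import struct
import sys

# Python tag, e.g. 'cp27' for CPython 2.7 (assumes the CPython interpreter)
python_tag = 'cp%d%d' % (sys.version_info[0], sys.version_info[1])

# Size of a pointer in bits: 64 for 64-bit Python, 32 for 32-bit Python
bits = struct.calcsize('P') * 8

# On Windows, 64-bit packages are tagged 'amd64' and 32-bit ones 'win32'
windows_tag = 'amd64' if bits == 64 else 'win32'

print('Python tag:', python_tag)
print('Python build: %d-bit (Windows tag %s)' % (bits, windows_tag))
```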

\subsubsection{Importing libraries}
\index{Python!libraries!importing}

To use any Python library, you just need to \textbf{import} it first.  For example, once you have installed Numerical Python, you can make it available (in the interactive Python environment or in a Python script) by typing the command \texttt{import numpy}, after which its classes and commands are accessed with the prefix \texttt{numpy.} (e.g. \texttt{numpy.array}).  Alternatively, \texttt{from numpy import *} imports all classes and commands from Numerical Python directly, so they can be used without the prefix.  (You can also import only parts of a library rather than the whole thing, e.g. \texttt{from numpy import linalg} just imports the linear algebra routines from Numerical Python.)

When you import a library, you can also change its name.  For example, PyTOUGH imports Numerical Python using the command \texttt{import numpy as np}, which renames \texttt{numpy} as the abbreviated \texttt{np}.  This means it can, for example, access the Numerical Python \texttt{numpy.array} data type as \texttt{np.array}.  It also means you have access to Numerical Python as \texttt{np} in your own scripts and in the interactive Python environment, without having to import it yourself.
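The different forms of import can be compared side by side, assuming Numerical Python is installed.  The array values are arbitrary.

```python
import numpy                  # names are accessed with the numpy. prefix
a = numpy.array([1.0, 2.0])

import numpy as np            # the same library under a shorter name
b = np.array([3.0, 4.0])

from numpy import linalg      # import only the linear algebra routines
length = linalg.norm(b)       # Euclidean length of the vector b

same_library = np is numpy    # both names refer to the same library
print(length, same_library)
```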

\section{Installing and accessing PyTOUGH}
\label{installing}
\index{PyTOUGH!website}
\index{PyTOUGH!installing}

The latest version of PyTOUGH can always be downloaded from the PyTOUGH website:

\url{https://github.com/acroucher/PyTOUGH/}

First, make sure you have Python and the Numerical Python library (see section \ref{pylibraries}) installed.  On Windows, you may have to add the directory where Python has been installed (e.g. \texttt{C:\textbackslash Python27} or similar, depending on which version you have) to your PATH environment variable, before you can access Python from the command line.

On the PyTOUGH website, click the `Download ZIP' button at the right of the page:

\url{https://github.com/acroucher/PyTOUGH/archive/master.zip}

to download PyTOUGH as a .zip file.  Unzip this to any directory on your computer.  This will create a directory containing a file called \texttt{setup.py}.

To install PyTOUGH, you will need administrator (`root' on Linux) privileges on your computer.  As administrator, open a command prompt, navigate to this new directory and type:

\begin{verbatim}
python setup.py install
\end{verbatim}

You should now be able to import the PyTOUGH libraries into the Python interactive environment or your Python scripts, from any directory on your computer.  For example, you can import the MULgraph geometry library using \texttt{from mulgrids import *} (see chapter \ref{mulgrids}).
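A quick way to check the installation is to try importing each library and report which ones are found.  The helper function \texttt{is\_available} below is hypothetical (it is not part of PyTOUGH), and the list of names assumes the usual import names of the libraries mentioned in this chapter.

```python
import importlib

def is_available(name):
    """Return True if the named library can be imported, otherwise False."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False

# Import names of the libraries mentioned in this chapter (assumed)
libraries = ['numpy', 'scipy', 'matplotlib', 'vtk', 'mulgrids']
status = dict((name, is_available(name)) for name in libraries)

for name in libraries:
    print('%-12s %s' % (name, 'found' if status[name] else 'missing'))
```

Only \texttt{numpy} and \texttt{mulgrids} need to be found for basic use; the others are needed only by the parts of PyTOUGH that depend on them.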

\section{Testing PyTOUGH}
\label{unittests}
\index{PyTOUGH!testing}
\index{unit tests}

PyTOUGH includes a suite of ``unit tests'' which can be used to verify that it is working correctly. These are located in the \texttt{tests/} directory, which includes a number of Python scripts for testing individual PyTOUGH modules.

These unit test modules may be run individually, the same way as any other Python script would be run. If the tests in the script all pass, the last message printed out to the console will read \texttt{OK}. If not, details will be output regarding which tests did not pass.

It is also possible to run the unit tests for all modules by running the following command in the \texttt{tests/} directory (test discovery requires Python 2.7 or newer):

\begin{verbatim}
python -m unittest discover
\end{verbatim}

or with the \texttt{-v} (verbose) flag to output more detail on which tests are being run:

\begin{verbatim}
python -m unittest discover -v
\end{verbatim}
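For readers who have not seen Python unit tests before, the sketch below shows the general shape of a test module written with the standard \texttt{unittest} library.  The function \texttt{block\_volume} and its tests are hypothetical and are not taken from the PyTOUGH test suite.

```python
import unittest

def block_volume(dx, dy, dz):
    """Hypothetical function under test: volume of a rectangular block."""
    return dx * dy * dz

class BlockVolumeTestCase(unittest.TestCase):
    """Each method whose name starts with 'test' is run as a separate test."""

    def test_cube(self):
        self.assertEqual(block_volume(10.0, 10.0, 10.0), 1000.0)

    def test_zero_thickness(self):
        self.assertEqual(block_volume(10.0, 10.0, 0.0), 0.0)

# Build the suite and run it explicitly, with verbose output
suite = unittest.TestLoader().loadTestsFromTestCase(BlockVolumeTestCase)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

When every test passes, the runner finishes by printing \texttt{OK}, as described above.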

\section{Licensing}
\index{PyTOUGH!license}

PyTOUGH is free software, distributed under the GNU Lesser General Public License (LGPL).  For more information, see \url{http://www.gnu.org/licenses/}.
